Data Exploration in SQL Server
Scroll down to the bottom of the page and select 'View Project Code' to view the SQL code on GitHub
About the Data
The dataset explored was real-world COVID data - please click here to access the dataset.Methodology
Data exploration is the first step in data analysis in which the data is explored and visualised to uncover insights and to identify areas which could be further investigated. Starting the data analysis process with data exploration ensures the user can make better decisions and gives them a broader understanding of the dataset allowing them to ask more thoughtful questions. To learn about the Tableau project which was carried out in conjuction with this project to visualise findings, please click here.The areas of exploration are listed below along with the associated SQL techniques that were used to gather the data - (NOTE: only the more advanced SQL syntax used in the code is listed to display level of expertise. Where no items are listed assume basic DDL and/or DML) :
1) Total Cases vs Total Deaths - likelihood of dying if you contract the virus in your country
2) Total Cases vs Population - What percentage got COVID
3) Countries with the highest infection rate compared to population
- GROUP BY
- ORDER BY
- MAX()
- MAX()
- CAST()
- ORDER BY
- GROUP BY
- MAX()
- CAST()
- ORDER BY
- GROUP BY
- MAX()
- CAST()
- SUM()
- ORDER BY
- GROUP BY
- CAST()
- SUM()
- ORDER BY
- GROUP BY
- JOIN
- PARTITION BY()
- MAX()
- CAST()
- CONVERT()
- SUM()
- ORDER BY
- GROUP BY
8a) % Vaccinated - CTE
- WITH - (for creation of CTE)
- All items in 8
- INSERT INTO
- CREATE TABLE
- DROP TABLE IF EXISTS
- All items in 8
- CREATE VIEW
- All items in 8